Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods

نویسندگان

  • Jascha Sohl-Dickstein
  • Ben Poole
  • Surya Ganguli
چکیده

We present an algorithm for minimizing a sum of functions that combines the computational efficiency of stochastic gradient descent (SGD) with the second order curvature information leveraged by quasi-Newton methods. We unify these disparate approaches by maintaining an independent Hessian approximation for each contributing function in the sum. We maintain computational tractability and limit memory requirements even for high dimensional optimization problems by storing and manipulating these quadratic approximations in a shared, time evolving, low dimensional subspace. This algorithm contrasts with earlier stochastic second order techniques that treat the Hessian of each contributing function as a noisy approximation to the full Hessian, rather than as a target for direct estimation. Each update step requires only a single contributing function or minibatch evaluation (as in SGD), and each step is scaled using an approximate inverse Hessian and little to no adjustment of hyperparameters is required (as is typical for quasi-Newton methods). We experimentally demonstrate improved convergence on seven diverse optimization problems. The algorithm is released as open source Python and MATLAB packages.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fast Probabilistic Optimization from Noisy Gradients

Stochastic gradient descent remains popular in large-scale machine learning, on account of its very low computational cost and robustness to noise. However, gradient descent is only linearly efficient and not transformation invariant. Scaling by a local measure can substantially improve its performance. One natural choice of such a scale is the Hessian of the objective function: Were it availab...

متن کامل

Stochastic quasi-Newton with adaptive step lengths for large-scale problems

We provide a numerically robust and fast method capable of exploiting the local geometry when solving large-scale stochastic optimisation problems. Our key innovation is an auxiliary variable construction coupled with an inverse Hessian approximation computed using a receding history of iterates and gradients. It is the Markov chain nature of the classic stochastic gradient algorithm that enabl...

متن کامل

New Quasi-Newton Optimization Methods for Machine Learning

This thesis develops new quasi-Newton optimization methods that exploit the wellstructured functional form of objective functions often encountered in machine learning, while still maintaining the solid foundation of the standard BFGS quasi-Newton method. In particular, our algorithms are tailored for two categories of machine learning problems: (1) regularized risk minimization problems with c...

متن کامل

Global convergence of online limited memory BFGS

Global convergence of an online (stochastic) limited memory version of the Broyden-FletcherGoldfarb-Shanno (BFGS) quasi-Newton method for solving optimization problems with stochastic objectives that arise in large scale machine learning is established. Lower and upper bounds on the Hessian eigenvalues of the sample functions are shown to suffice to guarantee that the curvature approximation ma...

متن کامل

Performance Profiles of Line-search Algorithms for Unconstrained Optimization

The most important line-search algorithms for solving large-scale unconstrained optimization problems we consider in this paper are the quasi-Newton methods, truncated Newton and conjugate gradient. These methods proved to be efficient, robust and relatively inexpensive in term of computation. In this paper we compare the Dolan-Moré [11] performance profile of line-search algorithms implemented...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014